1,659 research outputs found

On the Properties of Simulation-based Estimators in High Dimensions

Author: Guerrier Stéphane
Karemera Mucyo
Orso Samuel
Victoria-Feser Maria-Pia
Publication date: 01/01/2018

Considering the increasing size of available data, the need for statistical methods that control the finite sample bias is growing. This is mainly due to the frequent settings where the number of variables is large and allowed to increase with the sample size, causing standard inferential procedures to incur a significant loss in performance. Moreover, the complexity of statistical models is also increasing, thereby entailing important computational challenges in constructing new estimators or in implementing classical ones. A trade-off between numerical complexity and statistical properties is often accepted. However, numerically efficient estimators that are simultaneously unbiased, consistent and asymptotically normal in high dimensional problems would generally be ideal. In this paper, we set out a general framework from which such estimators can easily be derived for wide classes of models. This framework is based on the concepts that underlie simulation-based estimation methods such as indirect inference. The approach allows various extensions compared to previous results, as it is adapted to possibly inconsistent estimators and is applicable to discrete models and/or models with a large number of parameters. We consider an algorithm, namely the Iterative Bootstrap (IB), to efficiently compute simulation-based estimators, and we show its convergence properties. Within this framework we also prove the properties of simulation-based estimators, more specifically unbiasedness, consistency and asymptotic normality, when the number of parameters is allowed to increase with the sample size. An important implication of the proposed approach is therefore that it allows unbiased estimators to be obtained in finite samples. Finally, we study this approach when applied to three common models, namely logistic regression, negative binomial regression and lasso regression.
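The abstract names the Iterative Bootstrap but does not spell out its update. A minimal sketch, assuming the commonly stated form theta_next = theta + pi_hat - mean_h(pi_h(theta)) with the simulation seeds held fixed across iterations, is given below. The toy model (the biased variance estimator of a normal sample) and every function name here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def iterative_bootstrap(pi_hat, simulate_estimator, n_sim=200, max_iter=100, tol=1e-10):
    """IB-style fixed-point iteration (assumed form of the update).

    pi_hat: initial (possibly biased) estimate computed on the observed data.
    simulate_estimator(theta, seed): the same estimator computed on a sample
    simulated from the model at parameter value theta.
    """
    theta = float(pi_hat)
    for _ in range(max_iter):
        # Seeds are reused at every iteration (common random numbers).
        sims = [simulate_estimator(theta, seed=h) for h in range(n_sim)]
        theta_new = theta + pi_hat - np.mean(sims)
        if abs(theta_new - theta) < tol:
            return theta_new
        theta = theta_new
    return theta

def biased_variance(theta, seed, n=10):
    """Hypothetical toy estimator: variance with divisor n on N(0, theta) data."""
    rng = np.random.default_rng(seed)
    sample = rng.normal(0.0, np.sqrt(theta), size=n)
    return np.var(sample)  # ddof=0, biased downwards in small samples

# Observed data: 10 draws with true variance 4.
data = np.random.default_rng(12345).normal(0.0, 2.0, size=10)
pi_hat = np.var(data)
theta_ib = iterative_bootstrap(pi_hat, biased_variance)
print("initial estimate:", pi_hat, "IB estimate:", theta_ib)
```

At convergence the average of the simulated estimators equals the observed estimate, which is the matching condition behind simulation-based estimators of this kind.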

arXiv.org e-Print Archive

Archive ouverte UNIGE

A simple recipe for making accurate parametric inference in finite sample

Author: Guerrier Stéphane
Karemera Mucyo
Orso Samuel
Victoria-Feser Maria-Pia
Publication date: 01/01/2019

Constructing tests or confidence regions that control the error rates in the long run is probably one of the most important problems in statistics. Yet the theoretical justification for most methods in statistics is asymptotic. The bootstrap, for example, despite its simplicity and widespread usage, is an asymptotic method. There is in general no claim about the exactness of inferential procedures in finite samples. In this paper, we propose an alternative to the parametric bootstrap. We set up general conditions to demonstrate theoretically that accurate inference can be claimed in finite samples.

arXiv.org e-Print Archive

Archive ouverte UNIGE

A Flexible Bias Correction Method based on Inconsistent Estimators

Author: Guerrier Stéphane
Karemera Mucyo
Ma Yanyuan
Orso Samuel
Victoria-Feser Maria-Pia
Zhang Yuming
Publication date: 16/04/2022

An important challenge in statistical analysis lies in controlling the estimation bias when handling ever-increasing data size and model complexity. For example, approximate methods are increasingly used to address the analytical and/or computational challenges of implementing standard estimators, but they often lead to inconsistent estimators. Consistent estimators can therefore be difficult to obtain, especially for complex models and/or in settings where the number of parameters diverges with the sample size. We propose a general simulation-based estimation framework that allows the construction of consistent and bias-corrected estimators for parameters of increasing dimensions. The key advantage of the proposed framework is that it only requires computing a simple inconsistent estimator multiple times. The resulting Just Identified iNdirect Inference estimator (JINI) enjoys nice properties, including consistency, asymptotic normality, and better finite sample bias correction than alternative methods. We further provide a simple algorithm to construct the JINI in a computationally efficient manner. The JINI is therefore especially useful in settings where standard methods may be challenging to apply, for example in the presence of misclassification and rounding. We consider comprehensive simulation studies and analyze an alcohol consumption data example to illustrate the excellent performance and usefulness of the method.
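To make the idea of correcting an inconsistent estimator by simulation concrete, here is a toy sketch under rounding. It assumes exponential data observed only after rounding down, uses the sample mean of the rounded values as the simple inconsistent estimator, and finds the parameter whose simulated estimator matches the observed one. The bisection solver is a substitute chosen for brevity, not the algorithm from the paper, and all names are hypothetical.

```python
import numpy as np

def naive_estimator(obs):
    """Sample mean of rounded-down data: inconsistent for the exponential mean."""
    return float(np.mean(obs))

def matching_estimator(pi_hat, n, n_sim=20, seed=0, lo=1e-6, hi=50.0, n_iter=60):
    """Find theta such that the simulated naive estimator matches pi_hat."""
    rng = np.random.default_rng(seed)
    # Common random numbers: unit exponentials are drawn once and rescaled.
    unit_draws = rng.exponential(1.0, size=(n_sim, n))

    def simulated_mean(theta):
        return np.mean([naive_estimator(np.floor(theta * u)) for u in unit_draws])

    # simulated_mean is non-decreasing in theta, so bisection locates the crossing.
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if simulated_mean(mid) < pi_hat:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

true_theta = 0.8
observed = np.floor(np.random.default_rng(2022).exponential(true_theta, size=2000))
pi_hat = naive_estimator(observed)
theta_hat = matching_estimator(pi_hat, n=observed.size)
print("naive:", pi_hat, "matched:", theta_hat)
```

Only the naive estimator is ever computed, once on the data and repeatedly on simulated samples, which mirrors the key advantage stated in the abstract.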

arXiv.org e-Print Archive

Parameter Determination of Sensor Stochastic Models under Covariate Dependency

Author: Clausen Philipp
Guerrier Stephane
Orso Samuel
Skaloud Jan
Publication date: 18/06/2018

The proliferation of (low-cost) sensors raises new challenges in data fusion. These relate to the correctness of the stochastic characterization, which is a prerequisite for optimal estimation of parameters from redundant observations. Different (statistical) methods have been developed to estimate the parameters of complex stochastic models. To cite a few, there is the maximum likelihood approach computed via the so-called EM algorithm, as well as a linear regression approach based on the log-log representation of a quantity called the Allan Variance. Nevertheless, all these methods suffer from various limitations, ranging from numerical instability and computational inefficiency to statistical inconsistency. The relatively recent approach called the Generalized Method of Wavelet Moments (GMWM), which makes use of the Wavelet Variance (WV) of the error signal, was proven to estimate stochastic models of considerable complexity in a numerically stable and statistically consistent manner with good computational efficiency. The situation is more challenging when stochastic errors depend on external factors (e.g. temperature, pressure, dynamics). This paper presents the essence of a mathematical extension of the GMWM estimator that handles such a scenario rigorously by taking the external influences into consideration. We present the model of the multivariate stochastic process, which is composed firstly of the process of interest (the signal of a sensor) and secondly of an explanatory process (e.g. an environmental variable), where the latter is believed to have an impact on the stochastic properties of the former. Next, we assume that the input is composed of a real-valued "smooth" function dependent on the external influence (values of which are perfectly observed) and a zero-mean process that is itself a sum of several independent latent processes. Then we define the covariate-dependent latent process (e.g. a change of variance of a white noise or auto-regressive process) as a class of piece-wise covariate-dependent latent time series models described by n parameters. We propose to estimate the underlying parameter vector of interest using a modified version of the GMWM methodology that considers a linear approximation of the dependency between the noise parameters and the external influence. The intuition behind the new GMWM estimator is to select the parameter values that match the empirical WV of the data with the theoretical WV (i.e. the WV implied by the model parameters). We briefly demonstrate the asymptotic properties of the estimated parameter vector as well as the consistency of the estimator.
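The matching idea described in the abstract can be written as a generalized least-squares problem. The display below is a sketch in assumed notation (the abstract gives no symbols): the vector of empirical WV over J scales, the WV implied by the model at parameter theta, and a positive-definite weighting matrix Omega. The last line is only one hypothetical way to write the linear approximation in a covariate c that the abstract mentions.

```latex
% Assumed notation: \hat{\boldsymbol{\nu}} is the empirical WV vector over J scales,
% \boldsymbol{\nu}(\boldsymbol{\theta}) the model-implied WV, \boldsymbol{\Omega} a weighting matrix.
\hat{\boldsymbol{\theta}}
  = \operatorname*{argmin}_{\boldsymbol{\theta} \in \boldsymbol{\Theta}}
    \bigl( \hat{\boldsymbol{\nu}} - \boldsymbol{\nu}(\boldsymbol{\theta}) \bigr)^{\top}
    \boldsymbol{\Omega}
    \bigl( \hat{\boldsymbol{\nu}} - \boldsymbol{\nu}(\boldsymbol{\theta}) \bigr)

% Hypothetical covariate dependence of a noise parameter (linear approximation in c):
\boldsymbol{\theta}(c) \approx \boldsymbol{\theta}_{0} + \boldsymbol{\theta}_{1}\, c
```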

Infoscience - École polytechnique fédérale de Lausanne

Wavelet-Based Moment-Matching Techniques for Inertial Sensor Calibration

Author: Bakalli Gaetan
Guerrier Stéphane
Jurado Juan
Kabban Christine M. Schubert
Karemera Mucyo
Khaghani Mehran
Molinari Roberto
Orso Samuel
Raquet John
Skaloud Jan
Xu Haotian
Zhang Yuming
Publication date: 16/11/2019

The task of inertial sensor calibration has required the development of various techniques to take into account the sources of measurement error coming from such devices. The calibration of the stochastic errors of these sensors has been the focus of an increasing amount of research, in which the method of reference has been the so-called "Allan variance slope method". In addition to lacking appropriate statistical properties, this method requires a subjective input, which makes it prone to mistakes. To overcome this, recent research has started proposing "automatic" approaches in which the parameters of the probabilistic models underlying the error signals are estimated by matching functions of the Allan variance or Wavelet Variance with their model-implied counterparts. However, despite the increased use of such techniques, there has been no study or clear direction for practitioners on which approach is optimal for the purpose of sensor calibration. This paper formally defines the class of estimators based on this technique and puts forward theoretical and applied results that, in comparison with other estimators in this class, suggest the use of the Generalized Method of Wavelet Moments as an optimal choice.
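The class of estimators described above can be illustrated with a small moment-matching sketch on the Allan variance. It assumes an error signal made of white noise plus a random walk, uses the non-overlapping Allan variance, and relies on the discrete-time expressions sigma2/m and gamma2*(2*m**2 + 1)/(6*m) for the two components at cluster size m. The relative least-squares weighting is an arbitrary choice for illustration and is not the GMWM weighting; all names are hypothetical.

```python
import numpy as np

def allan_variance(x, m):
    """Non-overlapping Allan variance at cluster size m."""
    k = len(x) // m
    means = x[: k * m].reshape(k, m).mean(axis=1)
    return 0.5 * np.mean(np.diff(means) ** 2)

def fit_wn_rw(x, scales):
    """Match empirical Allan variance to white noise + random walk theory.

    Returns (sigma2, gamma2): white-noise variance and random-walk
    innovation variance, from least squares on relative residuals.
    """
    emp = np.array([allan_variance(x, m) for m in scales])
    m = np.asarray(scales, dtype=float)
    # Model is linear in (sigma2, gamma2): AVAR(m) = sigma2/m + gamma2*(2m^2+1)/(6m).
    design = np.column_stack([1.0 / m, (2.0 * m**2 + 1.0) / (6.0 * m)])
    # Dividing each row by the empirical value gives relative residuals.
    coef, *_ = np.linalg.lstsq(design / emp[:, None], np.ones_like(emp), rcond=None)
    return coef

rng = np.random.default_rng(7)
n = 2**16
white = rng.normal(0.0, 1.0, size=n)            # sigma2 = 1
walk = np.cumsum(rng.normal(0.0, 0.1, size=n))  # gamma2 = 0.01
scales = [2**j for j in range(10)]
sigma2_hat, gamma2_hat = fit_wn_rw(white + walk, scales)
print("sigma2:", sigma2_hat, "gamma2:", gamma2_hat)
```

Unlike the slope method, no scale range has to be picked by eye for each component: all scales enter a single fit, which is the "automatic" aspect the abstract refers to.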

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

AFTI Scholar (Air Force Institute of Technology)

Carbon in the Biomass Components of Acacia mearnsii De Wild. (original title: CARBONO NOS COMPONENTES DA BIOMASSA DE Acacia mearnsii De Wild.)

Author: Behling Alexandre
Corte Ana Paula Dalla
Coutinho Vinícius Morais
da Silva Samuel Alves
Martins João Felipe Cardozo
Orso Gabriel Agostini
Publication venue: 'Universidade Federal do Parana'
Publication date: 21/08/2019

Carbon fixation in the biomass of forest species can be an important ally in combating climate change. Knowledge of the proportion of carbon in the components of the species is therefore closely tied to the accuracy with which carbon stocks are quantified. In this context, considering the economic and silvicultural suitability of Acacia mearnsii De Wild. in Brazil, the objective of the present study is to characterize carbon fixation in the biomass of black wattle by determining the mean carbon contents (TMC) for stem (TCF) and crown (TCC) and by evaluating the influence of stand age and site on them. To this end, 671 trees were sampled from stands aged between 1 and 10.75 years, located at three different sites in the southeastern region of Rio Grande do Sul. The trees were grouped proportionally into four age classes, named Young, Early Middle, Advanced Middle and Mature. Carbon contents were obtained by a destructive method using a carbon analyzer (C-144, LECO). TMC values ranged between 44.77% and 46.43% across the crown and stem components, and in all age groups the crown showed statistically higher values than the stem. Both the site and age factors had an effect on TCC, whereas for TCF only the site factor had an effect. Black wattle fixes carbon in its biomass in a manner similar to the main species of the Brazilian forestry sector, which indicates great potential for projects aimed at fixing carbon in biomass.

Biblioteca Digital de Periódicos da UFPR (Universidade Federal do Paraná)